Abstract

There has been considerable work in AI on decision- 
theoretic planning and planning under uncertainty. 
Unfortunately, all of this work suffers from one or more of 
the following limitations: 1) it relies on very simple models 
of actions and time, 2) it assumes that uncertainty is 
manifested in discrete action outcomes, and 3) it is only 
practical for very small problems. For many real world 
problems, these assumptions fail to hold. A case in point is 
planning the activities for a Mars rover. For this domain none 
of the above assumptions are valid: 1) actions can be 
concurrent and have differing durations, 2) there is 
uncertainty concerning action durations and consumption of 
continuous resources like power, and 3) typical daily plans 
involve on the order of a hundred actions. We describe the 
rover problem, discuss previous work on planning under 
uncertainty, and present a detailed, but very small, example 
illustrating some of the difficulties of finding good plans. 

The Problem 

Consider a rover operating on the surface of Mars. On a giv- 
en day, there are a number of different scientific observa- 
tions or experiments that the rover could perform, and these 
are prioritized in some fashion (each observation or experi- 
ment is assigned a scientific value). Different observations 
and experiments take differing amounts of time and con- 
sume differing amounts of power and data storage. There 
are, in general, a number of constraints that govern the rov- 
er’s activities: 

• There are time, power, data storage, and positioning 
constraints for performing different activities. Time con- 
straints often result from illumination requirements - that 
is, experiments may require that a target rock or sample be 
illuminated with a certain intensity, or from a certain an- 
gle. 

• Experiments have setup conditions (preconditions) that 
must hold before they can be performed. For example, the 
rover will usually need to be at a particular location and 
orientation for each experiment and will need instruments
turned on, initialized, and calibrated. In general, there may
be multiple ways of achieving some of these setup condi- 
tions (e.g. different travel routes, different choice of cam- 
eras). 

• The amount of power available varies according to the 
time of day, since solar flux is a function of the angle of 
the sun. 

Given these constraints, the objective is to maximize scien- 
tific return for the rover - that is, find the plan with maximal 
utility. Unfortunately, for many rover activities, there is in- 
herent uncertainty about the duration of tasks, the power re- 
quired, the data storage necessary, the position and 
orientation of the rover, and environmental factors that
influence operations, e.g., soil characteristics, dust on the
solar panels, ambient temperature, etc.

For example, in driving from one location to another, the 
amount of time required depends on wheel slippage and 
sinkage, which varies depending on slope, terrain rough- 
ness, and soil characteristics. All of these factors also influ- 
ence the amount of power that is consumed. The amount of 
energy collected by the solar panels during this traverse de- 
pends on the length of the traverse, but also on the angle of 
the solar panels. This is dictated by the slope and roughness 
of the terrain. 

Similarly, for certain types of instruments, temperature 
affects the signal to noise ratio and, hence, affects the 
amount of time required to collect useful data. Since the 
temperature varies depending on the time of day and the
weather conditions, this duration is uncertain. The amount 
of power used depends upon the duration of the data collec- 
tion. The amount of data storage required depends on the ef- 
fectiveness of the data compression techniques, which 
ultimately depends on the nature of the data collected. 

In short, this domain is rife with uncertainty. Plans that do 
not take this uncertainty into account usually fail miserably. 
In fact, it has been estimated that the 1997 Mars Pathfinder 
rover spent between 40% and 75% of its time doing nothing 
because of plan failure. 
One way to attack this problem is to rely on real-time or
reactive replanning when failures occur. While this capabil- 
ity is certainly desirable, there are several difficulties with 
exclusive reliance on this approach: 

• Spacecraft and rovers have severely limited computa- 
tional resources due to power limitations and radiation 
hardening requirements. As a result, it is not always feasi- 
ble to do timely onboard replanning. 

• Many actions are potentially risky and require pre-ap- 
proval by mission operations personnel. Because of the 
cost and difficulty of communication, the rover receives 
infrequent command uplinks (typically one per day). As a
result, each daily plan must be constructed and checked
for safety well in advance.

• Some contingencies require anticipation; e.g., switching
to a backup system may require that the backup system
be warmed up in advance. For time critical operations
such as orbit insertions or landing operations there is in- 
sufficient time to perform these setup operations once the 
contingency has occurred, no matter how fast the planning 
can be done. 

For these reasons, it is sometimes necessary to plan in ad- 
vance for potential contingencies — that is, anticipate unex- 
pected outcomes and events and plan for them in advance. 

The problem that we have just described is essentially a 
decision-theoretic planning problem. More precisely, the 
problem is to produce a (concurrent) plan with maximal ex- 
pected utility, given the following domain information: 

• A set of possible goals that may be achievable, each of 
which has a value or reward associated with it. 

• A set of initial conditions, which may involve uncer- 
tainty about continuous quantities like temperature, en- 
ergy available, solar flux, and position. This 
uncertainty is characterized by probability distribu- 
tions over the possible values. 

• A set of possible actions, each of which is character- 
ized by: 

- a set of conditions that must be true before the 
action can be performed. (These may include metric 
temporal constraints as well as constraints on 
resource availability.) 

- an uncertain duration characterized by a probability 
distribution. 

- a set of certain and uncertain effects that describe 
the world following the action. Uncertain effects on 
continuous variables are characterized by probabil- 
ity distributions. 
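The domain information listed above can be written down as a small data model. The sketch below is only one possible encoding: the class names (`Gaussian`, `Action`, `Problem`), the choice of normal distributions, the interval form of the preconditions, and all numbers are assumptions made for illustration and are not taken from the rover system. A certain effect is simply one with zero standard deviation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Gaussian:
    """A distribution summarised by mean and standard deviation (assumed normal)."""
    mean: float
    std: float


@dataclass
class Action:
    name: str
    # Conditions that must hold before the action can start, given as
    # (low, high) bounds on continuous variables such as time or energy.
    preconditions: Dict[str, Tuple[float, float]]
    duration: Gaussian              # uncertain duration
    effects: Dict[str, Gaussian]    # uncertain change to each continuous variable

    def can_start(self, state: Dict[str, float]) -> bool:
        """True when every bounded variable lies within its (low, high) range."""
        return all(low <= state[var] <= high
                   for var, (low, high) in self.preconditions.items())


@dataclass
class Problem:
    goals: Dict[str, float]         # goal name -> value (reward)
    initial: Dict[str, Gaussian]    # uncertain initial continuous quantities
    actions: Dict[str, Action]


# Hypothetical numbers, for illustration only.
drive = Action(
    name="Drive",
    preconditions={"energy": (2.0, float("inf"))},
    duration=Gaussian(mean=600.0, std=120.0),
    effects={"energy": Gaussian(mean=-1.5, std=0.4)},
)
problem = Problem(
    goals={"image_rock": 10.0},
    initial={"energy": Gaussian(mean=11.0, std=0.5)},
    actions={"Drive": drive},
)
print(drive.can_start({"energy": 11.0}))
```

Concurrency and metric temporal constraints between actions are not captured by this sketch; it only records the per-action information.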

Decision-theoretic planning is already known to be quite 
hard both in theory [18] and in practice. However, there are 
some characteristics of this domain, which, when taken to- 
gether, make this planning problem both difficult and differ- 
ent from the kinds of problems that have been studied in the 
past: 

• Time - actions take differing amounts of time and
concurrency is often necessary.

• Continuous outcomes - most of the uncertainty is as- 
sociated with continuous quantities like time and pow- 
er. In other words, actions do not have a small number 
of discrete outcomes. 

• Problem size - a typical daily plan for a rover will in- 
volve on the order of a hundred actions. 

While we have described this scenario for a rover, this kind 
of problem is not limited to robotics or even space applica- 
tions. For example, in a logistics problem, travel durations 
are influenced by both traffic and weather considerations. 
Fuel use is likewise influenced by these “environmental”
factors. There are temporal constraints on the availability
and delivery of cargo, as well as on the availability of both 
facilities and crew. There are also constraints on fuel loading 
and availability, and on maintenance operations. 

Previous Work 

There has been considerable work in AI on planning under 
uncertainty. Table 1 classifies much of this work along the 
following two dimensions: 

• Representation of uncertainty - whether uncertainty 
is modeled strictly logically, using disjunctions, or is 
modeled numerically, with probabilities. 

• Observability assumptions - whether the uncertain 
outcomes of actions are not observable, partially ob- 
servable, or fully observable. 



                      Disjunction        Probability

Non-Observable        CGP [31]           Buridan [17]
                      CMBP [9, 1]        UDTPOP [23]
                      C-PLAN [13, 8]
                      Fragplan [16]

Partially-Observable  SENSp [12]         C-Buridan [10]
                      Cassandra [25]     DTPOP [23]
                      PUCCINI [14]       C-MAXPLAN [19]
                      SGP [34]           ZANDER [19]
                      QBF-Plan [27]      Mahinur [22]
                      GPT [6]            POMDP [7]
                      MBP [2]

Fully-Observable      WARPLAN-C [33]     JIC [11]
                      CNLP [24]          Weaver [4]
                      Plinth [15]        PGP [3]
                                         MDP [7]

Table 1: A classification of planners that deal with uncertainty.
Planners in the top row are often referred to as conformant
planners, while those in the other two rows are often referred to as
contingency planners.

We do not discuss this work in detail here. A survey of some 
of this work can be found in Blythe [5]. A more detailed
survey of work on MDPs and POMDPs can be found in
Boutilier, Dean and Hanks [7]. Instead we will focus on why this
work is generally not applicable to the rover problem and
what can be done about this. 

There are a number of difficulties in attempting to apply 
existing work on planning under uncertainty to spacecraft or 
rovers. First of all, the work listed in Table 1 assumes a very 
simple model of action in which concurrent actions are not 
permitted, explicit time constraints are not allowed, and ac- 
tions are considered to be instantaneous. As we said above, 
none of these assumptions hold for typical spacecraft or rov- 
er operations. These characteristics are not as much of an ob- 
stacle for Partial-Order Planning frameworks such as 
SENSp [12], PUCCINI [14], WARPLAN-C [33], CNLP [24], 
Buridan [17], UDTPOP [23], C-Buridan [10], DTPOP [23], 
Mahinur [22] and Weaver [4]. In theory, these systems could 
represent plans with concurrent actions and complex tempo- 
ral constraints. The requirements for a rich model of time 
and action are more problematic for planning techniques 
that are based on the MDP or POMDP representations, sat- 
isfiability encodings, the graphplan representation, or state- 
space encodings. These techniques rely heavily on a discrete 
model of time and action. (See [30] for a more detailed dis- 
cussion of this issue.) Although semi-Markov decision pro- 
cesses (SMDPs) [26] can be used to represent actions with 
uncertain durations, they cannot model concurrent actions 
with complex temporal dependencies. The factorial MDP 
model has recently been developed to allow concurrent ac- 
tions in an MDP framework. However, this model is limited 
to discrete time and state representations. Moreover, existing 
solution techniques are either too general to be efficient on 
real-world problems (e.g. Singh and Cohn [28]), or too
domain-specific to be applicable to the rover problem (e.g.
Meuleau et al. [20]).

A second, and equally serious, problem with existing
contingency planning techniques is that they all assume that
uncertain actions have a small number of discrete outcomes.
For example, in the representation popularized by Buridan
and C-Buridan, a rover movement action might be
characterized as shown in Figure 1. In this representation, each
arrow to a proposition on the right indicates a possible
outcome of the action, along with the associated probability of
that transition (see note 3). To characterize where a rover could
end up after a move operation, we have to list all the different
possible discrete locations. We would need to do something
similar to characterize power usage. For most spacecraft and
rover activities this kind of discrete representation is
impractical - most of the uncertainty involves continuous
quantities, such as the amount of time and power an activity
requires. Action outcomes are distributions over these
continuous quantities. There is some recent work using models
with continuous action outcomes in both the MDP [29, 21] and
POMDP [32] literature, but this has not yet been applied to
SMDPs and has primarily been applied to reinforcement
learning rather than planning problems.

Figure 1: A C-Buridan action for movement.

3. We have omitted some details here. For each transition, there is
a condition that the rover must be at location [1,1] to start with,
and that the rover is no longer at [1,1] for each outcome.

Ultimately, the state that results from performing an 
action determines the future actions that will be taken, so in 
this sense an action's outcomes are discretized. However, 
this discretization is not a static property of the actions;
instead, it depends on what goals or subgoals the planner is 
trying to accomplish. For example, suppose that the rover is 
trying to move to a certain location. If the objective is to 
place an instrument on a particular rock feature, then the 
tolerance in position is quite small. In contrast, if the objec- 
tive is to take a picture from a different vantage point, then 
the tolerance can be significantly larger. 

A third problem with conventional contingency planning 
technology is that it does not scale to larger problems. Part 
of the problem is that most of the algorithms attempt to ac- 
count for all possible contingencies. In effect, they try to 
produce policies. For spacecraft and rover operations, this is 
not realistic or tractable - a daily plan can involve on the or- 
der of a hundred operations, many of which have uncertain 
outcomes that can impact downstream actions. The resulting 
plans must also be simple enough that they can be under- 
stood by mission operators, and it must be feasible to do de- 
tailed simulation and validation on them in a limited time 
period. This means that a planner can only afford to plan in 
advance for the “important” contingencies and must leave 
the rest to run-time replanning. Of the planning systems dis- 
cussed above, only Just-In-Case (JIC) contingency schedul- 
ing [11] and Mahinur [22] exhibit a principled approach to 
choosing what contingencies to focus on. We will discuss 
this approach in more detail later. 

A Detailed Example 

In order to illustrate the problem further, in this section we 
give a detailed example of a very small rover problem.
Figure 2 shows a “primary” plan and two potential branches.
The primary plan consists of approaching a target point
(VisualServo), digging the soil (Dig), backing up (Drive), and
taking spectral images of the area (NIR). One potential
alternate branch consists of replacing the spectral image with a
high-resolution camera image of the target (HiRes). A second
potential branch consists of taking a low-resolution
panorama of the area (LoRes), performing on-board image
analysis to find rocks in the panorama (RockFinder), and
then taking spectral images of the rocks found (NIR). For
this example, we assume that energy is only being depleted.
(More generally, a rover would also be receiving energy
input from charging.)

Figure 2: A detailed rover problem. A “main” plan, and two
possible alternative branch plans are shown. Probability
distributions for time and energy usage are shown for each action.
Time and energy constraints for actions are shown in bold.

Precedence constraints are denoted by arrows in the figure;
for example, since HiRes can only be performed after
Drive, there is an arrow from Drive to HiRes. For each ac- 
tion, there may be preconditions, expectations, and a local 
utility; in the figure, these appear above the plan actions. The 
preconditions specify under what conditions execution of 
the action may start. The expectations describe the expected 
resource consumption of the actions (in terms of mean and 
standard deviation); the relative width of distributions is il- 
lustrated graphically as well. The local utility is the reward 
received when the action terminates successfully: in this ex- 
ample, this will be when the preconditions are met and when 
the energy resource is non-negative at the end of execution. 

In the example, consider the HiRes action. It has an energy
precondition E > 0.02 Ah and a time precondition of 9:00
< t < 16:00. The expected energy usage is 0.01 Amp-hours
(Ah) with a standard deviation of 0 Ah (so in this case there 
is no uncertainty in the model). The expected duration is 5 
seconds with a standard deviation of 1 second. The local 
utility of the action is v=10. 
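As a concrete reading of these numbers, the following sketch samples a single execution of HiRes. The constants are the values quoted above; everything else is an assumption made for illustration: the function name `execute_hires`, the use of normal distributions, the clipping of samples at zero, and the encoding of times as seconds since midnight.

```python
import random

# Values for HiRes as stated in the text; times are seconds since midnight.
MIN_ENERGY_AH = 0.02
WINDOW_S = (9 * 3600, 16 * 3600)
ENERGY_MEAN_AH, ENERGY_STD_AH = 0.01, 0.0
DURATION_MEAN_S, DURATION_STD_S = 5.0, 1.0
UTILITY = 10.0


def execute_hires(energy_ah, start_s, rng):
    """Sample one execution; return (reward, energy_after, end_time)."""
    in_window = WINDOW_S[0] < start_s < WINDOW_S[1]
    if not (energy_ah > MIN_ENERGY_AH and in_window):
        return 0.0, energy_ah, start_s           # preconditions not met
    # Assumed normal distributions, clipped so that samples are never negative.
    used = max(0.0, rng.gauss(ENERGY_MEAN_AH, ENERGY_STD_AH))
    took = max(0.0, rng.gauss(DURATION_MEAN_S, DURATION_STD_S))
    energy_after = energy_ah - used
    # The local utility is received only if energy is non-negative at the end.
    reward = UTILITY if energy_after >= 0.0 else 0.0
    return reward, energy_after, start_s + took


rng = random.Random(0)
print(execute_hires(1.0, 10 * 3600, rng))
```

Because the energy standard deviation is 0, only the end time varies from one sample to the next.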


Approaches 

There are several possible ways of attacking this problem of 
planning with continuous uncertain variables. In this sec- 
tion, we briefly discuss some of these, and the issues that 
arise. 

Computing the Optimal Value Function 

Figure 3: Optimal value function for the example in Figure 2. The
left axis is increasing energy from 0 to 20. The right axis is start
time. Vertical axis is expected utility.

Figure 3 shows the optimal value function for the problem in
Figure 2. The figure was computed by working backwards
from all possible activities that have positive reward and
using dynamic programming to construct the optimal plan.
The curved hump where there is lots of power and time 
available corresponds to the primary plan, while the
rectangular block corresponds to branching to the RockFinder plan
and completing the NIR. The tail of the curved bump is a
branch after the drive action to the HiRes plan. The flat
surface with value 5 is again an immediate branch to the
RockFinder plan, but in this area there is not enough power or
time to complete the plan, and only the LoRes reward is
received. Figure 4 shows a cross-section through this surface
for power equal to 11, showing how the various branches 
contribute to the overall plan. Note that the utility of the 
overall plan is higher in some places than the value of any 
original branch. This is because future branch points allow 
us to wait and see whether a particular plan will succeed, and 
if it is unlikely to succeed, we can take an alternative branch. 

Given a detailed contingent plan and the distributions for 
time and resource usage, it is relatively straightforward to 
evaluate the expected utility of the plan. If the distributions 
are very simple, it may be possible to compute this quantity 
exactly; more generally, this will have to be done with sto- 
chastic simulation. Thus, if we could generate all possible 
contingent plans for a problem, we could evaluate each of 
them and choose the one with highest utility. Of course this 
is completely impractical for problems of any size, partly 
because it is impossible to enumerate the conditions for
conditional branches. The dynamic programming approach we
took above is an alternative, but it too is computationally
expensive, and it fails completely when resource availability is
not monotonically decreasing (because optimization can no
longer be performed through a single backward pass).

Figure 4: Slice of the optimal value function for energy = 11 Ah,
along with the component curves that contribute to the overall
utility.
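Evaluating one given contingent plan by stochastic simulation, as described above, can be sketched as follows. This is a minimal illustration under several assumptions: durations and energy use are taken to be independent normal samples clipped at zero, a plan is a list of `Step` and `Branch` items (both names are hypothetical), and all numbers are invented.

```python
import random
from typing import Callable, NamedTuple


class Step(NamedTuple):
    name: str
    duration: tuple         # (mean, std)
    energy: tuple           # (mean, std) of energy consumed
    latest_start: float     # time precondition
    min_energy: float       # energy precondition
    reward: float


class Branch(NamedTuple):
    condition: Callable[[float, float], bool]   # test on (time, energy)
    alternative: list                           # plan followed if the test holds


def simulate_once(plan, t, e, rng):
    """Run one sampled execution and return the total reward collected."""
    total, items, i = 0.0, list(plan), 0
    while i < len(items):
        item = items[i]
        if isinstance(item, Branch):
            if item.condition(t, e):
                items, i = list(item.alternative), 0    # switch plans
            else:
                i += 1
            continue
        if t > item.latest_start or e < item.min_energy:
            break                                       # precondition fails
        t += max(0.0, rng.gauss(*item.duration))
        e -= max(0.0, rng.gauss(*item.energy))
        if e < 0.0:
            break                                       # ran out of energy
        total += item.reward
        i += 1
    return total


def expected_utility(plan, t0, e0, samples=10_000, seed=0):
    """Monte Carlo estimate of the plan's expected utility."""
    rng = random.Random(seed)
    return sum(simulate_once(plan, t0, e0, rng) for _ in range(samples)) / samples


# Hypothetical plan: drive, then fall back to HiRes if energy is low.
hires = [Step("HiRes", (5.0, 1.0), (0.5, 0.0), 100.0, 1.0, 10.0)]
plan = [
    Step("Drive", (20.0, 5.0), (4.0, 1.5), 100.0, 2.0, 0.0),
    Branch(lambda t, e: e < 6.0, hires),
    Step("NIR", (30.0, 5.0), (5.0, 1.0), 100.0, 5.0, 100.0),
]
print(expected_utility(plan, t0=0.0, e0=11.0))
```

The estimate is only as good as the sample count, and the sketch evaluates a plan that is already given; it does nothing to generate candidate plans or branch conditions.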

Heuristic Approaches 

One possibility is to try to plan for the worst case scenario. 
Thus, in the example from the last section, we could assume 
that the drive operation requires time and power that is one 
or perhaps even two standard deviations above the mean. 
The trouble is, this approach is overly conservative and 
leads to plans with less science gain than is typically possi- 
ble. In the example from the previous section, if plan execu- 
tion was expected to begin at 13:45, this approach would 
lead us to build a “safe” primary plan that replaces NIR 
with the HiRes action, with expected utility of 10 in all 
cases, instead of the more ambitious current primary plan, 
with expected utility of 0 in the worst case, but 32 in the
average case and 100 in the best case. 
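The conservative budgeting described above amounts to padding every estimate by some number k of standard deviations. A minimal sketch, with hypothetical function names and invented numbers:

```python
def padded(mean, std, k):
    """Budget for one quantity: mean plus k standard deviations."""
    return mean + k * std


def fits_budget(steps, time_budget, energy_budget, k=2.0):
    """steps: list of (dur_mean, dur_std, energy_mean, energy_std)."""
    time_needed = sum(padded(dm, ds, k) for dm, ds, _, _ in steps)
    energy_needed = sum(padded(em, es, k) for _, _, em, es in steps)
    return time_needed <= time_budget and energy_needed <= energy_budget


# Hypothetical two-step plan: (duration s, std, energy Ah, std).
steps = [(600.0, 120.0, 1.5, 0.4), (300.0, 60.0, 2.0, 0.5)]
print(fits_budget(steps, 1000.0, 4.0, k=0.0))   # planning on the means
print(fits_budget(steps, 1000.0, 4.0, k=2.0))   # planning two deviations out
```

With these numbers the plan fits the budget when judged by its means and is rejected when padded by two standard deviations, which is the loss of science return that the text describes.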

A more ambitious approach to the problem would be to 
build an initial plan based on the expected behavior of vari- 
ous activities and then attempt to improve that plan by aug- 
menting it with contingent branches. This is the approach 
taken by Drummond, Bresina and Swanson with their Just- 
in-Case (JIC) telescope scheduling [11]. This approach is in- 
tuitively simple and appealing, but extending it to problems 
like the one we have outlined is non-trivial. The primary dif- 
ficulty is to decide where contingent branches should be 
added to a plan. In JIC scheduling, branches were added at 
the points with the greatest probability of plan failure. Given 
the distributions for time and resource usage this is relatively 
easy to calculate by statistical simulation of the plan. Unfor- 
tunately, the points most likely to fail are not necessarily the 
points where useful alternatives are available. The points of 
maximal failure probability may be too late in the plan to 
have enough time or power left for any useful alternative. 
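The failure-probability calculation used to place branches can be sketched by simulating a linear plan many times and counting where it first fails. This is not the JIC implementation; the function names, the normal distributions, the single deadline, and the numbers are all assumptions made for illustration.

```python
import random
from collections import Counter


def failure_counts(steps, t0, e0, deadline, samples=10_000, seed=0):
    """steps: list of (name, (dur_mean, dur_std), (energy_mean, energy_std)).

    Returns a Counter mapping step name to the number of sampled runs
    in which that step was the first to fail.
    """
    rng = random.Random(seed)
    failures = Counter()
    for _ in range(samples):
        t, e = t0, e0
        for name, dur, energy in steps:
            t += max(0.0, rng.gauss(*dur))
            e -= max(0.0, rng.gauss(*energy))
            if t > deadline or e < 0.0:
                failures[name] += 1
                break
    return failures


def most_likely_failure(steps, t0, e0, deadline, **kwargs):
    """Name of the step that fails most often, or None if no run failed."""
    counts = failure_counts(steps, t0, e0, deadline, **kwargs)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical plan and resources.
steps = [
    ("VisualServo", (20.0, 5.0), (2.0, 0.5)),
    ("Dig", (30.0, 10.0), (4.0, 1.0)),
    ("Drive", (15.0, 5.0), (2.0, 1.0)),
    ("NIR", (30.0, 5.0), (3.0, 0.5)),
]
print(most_likely_failure(steps, t0=0.0, e0=11.0, deadline=110.0))
```

The sketch only locates the most probable failure point. As the paragraph above notes, that point may fall too late in the plan for any alternative to be useful, so it is not by itself a good criterion for choosing branch points.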

Unfortunately, the problem of finding high utility
branch points is non-trivial. Figure 5 shows the expected
utility over time of the possible plans with a single branch,
for a fixed starting energy of 11. Note that at earlier start
times, the plans with the highest expected utility are those
that postpone the decision to later in the primary plan, where
the possibility of receiving the 100 reward for the NIR action
can be more accurately assessed. In a small region, the
expected utility of the full RockFinder plan makes that plan
more valuable. As time advances, the probability of
succeeding in either the primary plan or the full RockFinder plan
diminishes, and the HiRes branch becomes the dominant plan.
Without the HiRes branch, the early branch to the
RockFinder plan (slightly) dominates the other branches late in
the time window, since delaying that branch may, with small
probability, cause a failure due to energy, resulting in no
utility.

Figure 5: Utility for a single branch at different possible branch
points with energy = 11.

Finding the Branch Conditions 

Once we've decided to add a branch to a plan, there is still a 
problem of deciding under what conditions to take the
branch. Once again, we could use dynamic programming to 
compute the optimal conditions, but this suffers from the 
problems we described above. In addition, as Figure 3 illus- 
trates, the optimal conditions can be extremely complex and 
hard to represent. The flat surfaces of utility 5 and 55 corre- 
spond to branching to the RockFinder plan before the first 
step of the primary plan. The primary plan (along with the 
later possible branch to the HiRes plan) is of higher expected 
utility where the surface is curved. The conditions for the 
branch point at the beginning of the primary plan are thus the 
boundaries between the curved surfaces and the flat
surfaces. The boundaries are in this case discontinuous,
corresponding to a disjunctive condition.

It is important to bear in mind that the boundaries are 
generally places where the values of two different branches 
are equal, which means that approximate solutions will usu- 
ally be acceptable here. One possibility is to treat the contin- 
uous dimensions of the problem as independent, which 
results in rectangular regions. This works well in most cases, 
but the boundaries must be chosen with care where there are 
abrupt edges in the value function. This approximation may
also fail if there are dependencies between the dimensions, 
for example when the energy used for driving is dependent 
on the actual time spent, rather than being treated indepen- 
dently as in our example. 
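Treating time and energy as independent gives branch conditions that are rectangles in the (time, energy) plane, and a disjunctive condition becomes a union of such rectangles. A minimal sketch, in which the `Box` type, the `take_branch` function, and the region boundaries are all hypothetical:

```python
from typing import NamedTuple


class Box(NamedTuple):
    t_min: float
    t_max: float
    e_min: float
    e_max: float

    def contains(self, t, e):
        """True when (t, e) lies inside this rectangle, edges included."""
        return self.t_min <= t <= self.t_max and self.e_min <= e <= self.e_max


def take_branch(boxes, t, e):
    """Disjunctive condition: branch if (t, e) lies in any of the boxes."""
    return any(box.contains(t, e) for box in boxes)


# Hypothetical regions in which an early branch is preferred
# (time in minutes after some reference, energy in Ah).
early_branch = [
    Box(t_min=0.0, t_max=30.0, e_min=0.0, e_max=6.0),     # low energy
    Box(t_min=60.0, t_max=90.0, e_min=0.0, e_max=20.0),   # late start
]
print(take_branch(early_branch, t=10.0, e=4.0))
print(take_branch(early_branch, t=45.0, e=11.0))
```

Axis-aligned boxes cannot express a boundary that depends jointly on time and energy, which is the failure case mentioned above when the dimensions are not independent.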

Conclusions 

For a Mars rover, uncertainty is absolutely pervasive in the 
domain. There is uncertainty in the duration of many activi- 
ties, in the amount of power that will be used, in the amount 
of data storage that will be required, and in the location and 
orientation of the rover. Unfortunately, current techniques 
for planning under uncertainty are limited to simple models 
of time, and actions with discrete outcomes. In the rover do- 
main there is concurrent action, actions of differing dura- 
tion, and most of the uncertainty is associated with 
continuous quantities like time, power, position and orienta- 
tion. 

For any non-trivial problem, it seems unlikely that exact 
or optimal solutions will be possible. Nor do we have good 
heuristic techniques for generating effective contingent 
plans. It seems that new and dramatically different ap- 
proaches are needed to deal with this kind of problem. 